In the mathematical fields of geometry and linear algebra, a principal axis is a certain line in a Euclidean space associated with an ellipsoid or hyperboloid, generalizing the major and minor axes of an ellipse or hyperbola. The principal axis theorem states that the principal axes are perpendicular, and gives a constructive procedure for finding them.

Mathematically, the principal axis theorem is a generalization of the method of completing the square from elementary algebra. In linear algebra and functional analysis, the principal axis theorem is a geometrical counterpart of the spectral theorem. It has applications to the statistics of principal components analysis and the singular value decomposition. In physics, the theorem is fundamental to the studies of angular momentum and birefringence.

Motivation

The equations in the Cartesian plane R²:

\begin{align}
  \frac{x^2}{a^2} + \frac{y^2}{b^2} &= 1 \\
  \frac{x^2}{a^2} - \frac{y^2}{b^2} &= 1
\end{align}

(with ''a'' and ''b'' nonzero constants) define, respectively, an ellipse and a hyperbola. In each case, the ''x'' and ''y'' axes are the principal axes. This is easily seen, given that there are no ''cross-terms'' involving products ''xy'' in either expression. However, the situation is more complicated for equations like

5x^2 + 8xy + 5y^2 = 1.

Here some method is required to determine whether this is an ellipse or a hyperbola. The basic observation is that if, by completing the square, the quadratic expression can be reduced to a sum of two squares then the equation defines an ellipse, whereas if it reduces to a difference of two squares then the equation represents a hyperbola:

\begin{align}
  u(x, y)^2 + v(x, y)^2 &= 1\qquad \text{(ellipse)} \\
  u(x, y)^2 - v(x, y)^2 &= 1\qquad \text{(hyperbola)}.
\end{align}

Thus, in our example expression, the problem is how to absorb the coefficient of the cross-term 8''xy'' into the functions ''u'' and ''v''. Formally, this problem is similar to the problem of matrix diagonalization, where one tries to find a suitable coordinate system in which the matrix of a linear transformation is diagonal. The first step is to find a matrix to which the technique of diagonalization can be applied. The trick is to write the quadratic form as

5x^2 + 8xy + 5y^2 =
  \begin{bmatrix}
    x & y
  \end{bmatrix}
  \begin{bmatrix}
    5 & 4 \\
    4 & 5
  \end{bmatrix}
  \begin{bmatrix}
    x \\ y
  \end{bmatrix} =
  \mathbf{x}^\textsf{T} A\mathbf{x}

where the cross-term has been split into two equal parts. The matrix ''A'' in the above decomposition is a symmetric matrix. In particular, by the spectral theorem, it has real eigenvalues and is diagonalizable by an orthogonal matrix (''orthogonally diagonalizable'').

To orthogonally diagonalize ''A'', one must first find its eigenvalues, and then find an orthonormal eigenbasis. Calculation reveals that the eigenvalues of ''A'' are

\lambda_1 = 1,\quad \lambda_2 = 9

with corresponding eigenvectors

\mathbf{v}_1 = \begin{bmatrix} 1 \\ -1 \end{bmatrix},\quad
  \mathbf{v}_2 = \begin{bmatrix} 1 \\  1 \end{bmatrix}.

Dividing these by their respective lengths yields an orthonormal eigenbasis:

\mathbf{u}_1 = \begin{bmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{bmatrix},\quad
  \mathbf{u}_2 = \begin{bmatrix} 1/\sqrt{2} \\  1/\sqrt{2} \end{bmatrix}.
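The eigenvalue calculation above can be reproduced with a short script. The following is a minimal sketch for the 2×2 case only, using the closed-form roots of the characteristic polynomial (here (5 − λ)² − 16 = (λ − 1)(λ − 9)); the helper names `symmetric_matrix` and `eigen_2x2_symmetric` are illustrative and do not come from any library.

```python
import math

def symmetric_matrix(a, b, c):
    """Matrix of a*x^2 + b*x*y + c*y^2, with the cross-term split into two equal parts."""
    return [[a, b / 2], [b / 2, c]]

def eigen_2x2_symmetric(A):
    """Return (eigenvalues in ascending order, matching unit eigenvectors)
    for a symmetric 2x2 matrix [[p, q], [q, r]]."""
    p, q, r = A[0][0], A[0][1], A[1][1]
    mean = (p + r) / 2
    radius = math.hypot((p - r) / 2, q)
    eigenvalues = [mean - radius, mean + radius]
    if q == 0:
        # Already diagonal: the coordinate axes are the eigenvectors.
        vectors = [(1.0, 0.0), (0.0, 1.0)] if p <= r else [(0.0, 1.0), (1.0, 0.0)]
        return eigenvalues, vectors
    vectors = []
    for lam in eigenvalues:
        # The first row of (A - lam*I) v = 0 is solved by v = (q, lam - p).
        vx, vy = q, lam - p
        norm = math.hypot(vx, vy)
        vectors.append((vx / norm, vy / norm))
    return eigenvalues, vectors

A = symmetric_matrix(5, 8, 5)
eigenvalues, eigenvectors = eigen_2x2_symmetric(A)
print(eigenvalues)
print(eigenvectors)
```

Eigenvectors are only determined up to sign, so a different routine may return the negatives of these vectors; the lines they span are the same.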

Now the matrix ''S'' = [u₁ u₂] is an orthogonal matrix, since it has orthonormal columns, and ''A'' is diagonalized by:

A = SDS^{-1} = SDS^\textsf{T} =
  \begin{bmatrix}
     1/\sqrt{2} & 1/\sqrt{2} \\
    -1/\sqrt{2} & 1/\sqrt{2}
  \end{bmatrix}
  \begin{bmatrix}
    1 & 0 \\
    0 & 9
  \end{bmatrix}
  \begin{bmatrix}
    1/\sqrt{2} & -1/\sqrt{2} \\
    1/\sqrt{2} &  1/\sqrt{2}
  \end{bmatrix}.
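The factorization can be checked numerically. The sketch below multiplies the three matrices back together and also confirms that ''S'' has orthonormal columns; the small helpers `matmul`, `transpose` and `is_close_matrix` are written out here for illustration rather than taken from a library.

```python
import math

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def is_close_matrix(X, Y, tol=1e-12):
    """Entrywise comparison up to a small floating-point tolerance."""
    return all(abs(x - y) <= tol
               for row_x, row_y in zip(X, Y)
               for x, y in zip(row_x, row_y))

s = 1 / math.sqrt(2)
S = [[s, s], [-s, s]]   # columns are u1 and u2
D = [[1, 0], [0, 9]]    # eigenvalues on the diagonal
A = [[5, 4], [4, 5]]

S_T = transpose(S)
reconstructed = matmul(matmul(S, D), S_T)   # S D S^T
gram = matmul(S_T, S)                       # S^T S

print(is_close_matrix(reconstructed, A))
print(is_close_matrix(gram, [[1, 0], [0, 1]]))
```

Because ''S'' is orthogonal, its transpose serves as its inverse, which is why no matrix inversion is needed.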

This applies to the present problem of "diagonalizing" the quadratic form through the observation that

5x^2 + 8xy + 5y^2 =
  \mathbf{x}^\textsf{T} A\mathbf{x} =
  \mathbf{x}^\textsf{T}\left(SDS^\textsf{T}\right)\mathbf{x} =
  \left(S^\textsf{T} \mathbf{x}\right)^\textsf{T} D\left(S^\textsf{T} \mathbf{x}\right) =
  1\left(\frac{x - y}{\sqrt{2}}\right)^2 + 9\left(\frac{x + y}{\sqrt{2}}\right)^2.
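As a check on this identity, the sum of squares can be expanded directly. The following worked expansion is added for illustration and uses only the quantities already defined:

```latex
\begin{align}
1\left(\frac{x - y}{\sqrt{2}}\right)^2 + 9\left(\frac{x + y}{\sqrt{2}}\right)^2
  &= \frac{x^2 - 2xy + y^2}{2} + \frac{9x^2 + 18xy + 9y^2}{2} \\
  &= \frac{10x^2 + 16xy + 10y^2}{2} \\
  &= 5x^2 + 8xy + 5y^2.
\end{align}
```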

Thus, the equation

5x^2 + 8xy + 5y^2 = 1

is that of an ellipse, since the left side can be written as the sum of two squares. It is tempting to simplify this expression by pulling out factors of 2. However, it is important ''not'' to do this. The quantities

c_1 = \frac{x - y}{\sqrt{2}},\quad c_2 = \frac{x + y}{\sqrt{2}}

have a geometrical meaning. They determine an ''orthonormal coordinate system'' on R². In other words, they are obtained from the original coordinates by the application of a rotation (and possibly a reflection). Consequently, one may use the ''c''₁ and ''c''₂ coordinates to make statements about ''length and angles'' (particularly length), which would be more difficult in another choice of coordinates (rescaled ones, for instance). For example, the maximum distance from the origin on the ellipse ''c''₁² + 9''c''₂² = 1 occurs when ''c''₂ = 0, so at the points ''c''₁ = ±1. Similarly, the minimum distance occurs when ''c''₁ = 0, so at the points ''c''₂ = ±1/3.

It is possible now to read off the major and minor axes of this ellipse. These are precisely the individual eigenspaces of the matrix ''A'', since these are where ''c''₂ = 0 or ''c''₁ = 0. Symbolically, the principal axes are

E_1 = \text{span}\left(\begin{bmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{bmatrix}\right),\quad
  E_2 = \text{span}\left(\begin{bmatrix} 1/\sqrt{2} \\  1/\sqrt{2} \end{bmatrix}\right).

To summarize:
* The equation is for an ellipse, since both eigenvalues are positive. (Otherwise, if one were positive and the other negative, it would be a hyperbola.)
* The principal axes are the lines spanned by the eigenvectors.
* The minimum and maximum distances to the origin can be read off the equation in diagonal form.

Using this information, it is possible to attain a clear geometrical picture of the ellipse: to graph it, for instance.
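The three points of the summary can be sketched in code. The example below assumes the eigenvalues 1 and 9 and the unit eigenvectors found earlier, entered by hand; `classify_central_conic` and `semi_axis_endpoint` are hypothetical helper names. It classifies the curve from the signs of the eigenvalues and locates the points where the ellipse meets its principal axes, at distance 1/√λ from the origin.

```python
import math

def classify_central_conic(lam1, lam2):
    """Classify lam1*c1^2 + lam2*c2^2 = 1 from the signs of the eigenvalues."""
    if lam1 > 0 and lam2 > 0:
        return "ellipse"
    if lam1 * lam2 < 0:
        return "hyperbola"
    return "degenerate or empty"

def semi_axis_endpoint(lam, u):
    """Point where the curve meets the axis spanned by unit vector u (needs lam > 0)."""
    length = 1 / math.sqrt(lam)
    return (length * u[0], length * u[1])

def q(x, y):
    """The quadratic form of the running example."""
    return 5 * x**2 + 8 * x * y + 5 * y**2

s = 1 / math.sqrt(2)
# Eigenpairs of A = [[5, 4], [4, 5]], entered by hand from the text.
eigenpairs = [(1, (s, -s)), (9, (s, s))]

kind = classify_central_conic(1, 9)
endpoints = [semi_axis_endpoint(lam, u) for lam, u in eigenpairs]
distances = [math.hypot(x, y) for x, y in endpoints]

print(kind)
for (x, y), d in zip(endpoints, distances):
    print(f"point ({x:.4f}, {y:.4f}), distance {d:.4f}, Q = {q(x, y):.4f}")
```

The smaller eigenvalue gives the longer semi-axis, so the major axis lies along the eigenvector for λ = 1 and the minor axis along the eigenvector for λ = 9.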

Formal statement

The principal axis theorem concerns quadratic forms in R^''n'', which are homogeneous polynomials of degree 2. Any quadratic form may be represented as

Q(\mathbf{x}) = \mathbf{x}^\textsf{T} A\mathbf{x}

where ''A'' is a symmetric matrix.

The first part of the theorem is contained in the following statements guaranteed by the spectral theorem:
* The eigenvalues of ''A'' are real.
* ''A'' is diagonalizable, and the eigenspaces of ''A'' are mutually orthogonal.

In particular, ''A'' is ''orthogonally diagonalizable'', since one may take a basis of each eigenspace and apply the Gram-Schmidt process separately within the eigenspace to obtain an orthonormal eigenbasis.

For the second part, suppose that the eigenvalues of ''A'' are λ₁, ..., λ_''n'' (possibly repeated according to their algebraic multiplicities) and the corresponding orthonormal eigenbasis is u₁, ..., u_''n''. Then,

\mathbf{c} = [\mathbf{u}_1, \ldots, \mathbf{u}_n]^\textsf{T} \mathbf{x},

and

Q(\mathbf{x}) = \lambda_1 c_1^2 + \lambda_2 c_2^2 + \dots + \lambda_n c_n^2,

where ''c''_''i'' is the ''i''-th entry of c. Furthermore, the ''i''-th principal axis is the line determined by equating ''c''_''j'' = 0 for all ''j'' = 1, ..., ''i'' − 1, ''i'' + 1, ..., ''n''. That is, the ''i''-th principal axis is the span of the vector u_''i''.
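The second part of the statement can be illustrated in three dimensions. The sketch below uses a hypothetical 3×3 symmetric matrix chosen for this example, with a repeated eigenvalue and an orthonormal eigenbasis entered by hand (not computed), and checks that the form evaluated directly agrees with the diagonal expression in the coordinates ''c''_''i'' = u_''i'' · x.

```python
import math

def quadratic_form(A, x):
    """Evaluate Q(x) = x^T A x for a matrix given as a list of rows."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

def principal_coordinates(basis, x):
    """Coordinates c_i = u_i . x with respect to an orthonormal eigenbasis."""
    return [sum(u_k * x_k for u_k, x_k in zip(u, x)) for u in basis]

s = 1 / math.sqrt(2)
# Illustrative matrix (an assumption of this example, not from the text).
A = [[2, 1, 0],
     [1, 2, 0],
     [0, 0, 3]]
# Eigenvalues with multiplicity, and matching orthonormal eigenvectors.
eigenvalues = [1, 3, 3]
basis = [(s, -s, 0), (s, s, 0), (0, 0, 1)]

x = (0.5, -2.0, 1.5)
c = principal_coordinates(basis, x)
direct_value = quadratic_form(A, x)
diagonal_value = sum(lam * ci**2 for lam, ci in zip(eigenvalues, c))

print(direct_value, diagonal_value)
```

For matrices whose eigenvectors are not known in advance, a numerical symmetric eigenvalue routine would supply `eigenvalues` and `basis`; the check itself is unchanged.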
